187 research outputs found
An empirical analysis of the shift and scale parameters in BatchNorm
Batch Normalization (BatchNorm) is a technique that improves the training of deep neural networks, especially Convolutional Neural Networks (CNNs). It has been empirically demonstrated that BatchNorm increases performance, stability, and accuracy, although the reasons for these improvements remain unclear. BatchNorm includes a normalization step as well as trainable shift and scale parameters. In this paper, we empirically examine the relative contributions of the normalization step and the re-parameterization via shifting and scaling to the success of BatchNorm. To conduct our experiments, we implement two new layers in PyTorch: a version of BatchNorm that we refer to as AffineLayer, which includes the re-parameterization step without normalization, and a version with just the normalization step, which we call BatchNorm-minus. We compare the performance of our AffineLayer and BatchNorm-minus implementations to standard BatchNorm, and we also compare these to the case where no batch normalization is used. We experiment with four ResNet architectures (ResNet18, ResNet34, ResNet50, and ResNet101) over a standard image dataset and multiple batch sizes. Among other findings, we provide empirical evidence that the success of BatchNorm may derive primarily from improved weight initialization.
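The decomposition the abstract describes can be sketched as follows. The names BatchNormMinus and AffineLayer mirror the paper's terminology, but this is an illustrative NumPy sketch of the two steps, not the authors' PyTorch implementation:

```python
import numpy as np

def batchnorm_minus(x, eps=1e-5):
    # Normalization step only: zero mean, unit variance per feature,
    # computed over the batch dimension (axis 0)
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)

def affine_layer(x, gamma, beta):
    # Re-parameterization step only: trainable scale (gamma) and shift (beta)
    return gamma * x + beta

def batchnorm(x, gamma, beta, eps=1e-5):
    # Standard BatchNorm composes both steps: normalize, then scale and shift
    return affine_layer(batchnorm_minus(x, eps), gamma, beta)
```

Comparing models that use only `batchnorm_minus` or only `affine_layer` against full `batchnorm` isolates which step drives the benefit.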
Computer-aided diagnosis of low grade endometrial stromal sarcoma (LGESS)
Low grade endometrial stromal sarcoma (LGESS) accounts for about 0.2% of all uterine cancer cases. Approximately 75% of LGESS patients are initially misdiagnosed with leiomyoma, a type of benign tumor also known as a fibroid. In this research, uterine tissue biopsy images of potential LGESS patients are preprocessed using segmentation and stain normalization algorithms. We then apply a variety of classic machine learning and advanced deep learning models to classify tissue images as either benign or cancerous. For the classic techniques considered, the highest classification accuracy we attain is about 0.85, while our best deep learning model achieves an accuracy of approximately 0.87. These results clearly indicate that properly trained learning algorithms can aid in the diagnosis of LGESS.
A Natural Language Processing Approach to Malware Classification
Many different machine learning and deep learning techniques have been
successfully employed for malware detection and classification. Examples of
popular learning techniques in the malware domain include Hidden Markov Models
(HMM), Random Forests (RF), Convolutional Neural Networks (CNN), Support Vector
Machines (SVM), and Recurrent Neural Networks (RNN) such as Long Short-Term
Memory (LSTM) networks. In this research, we consider a hybrid architecture,
where HMMs are trained on opcode sequences, and the resulting hidden states of
these trained HMMs are used as feature vectors in various classifiers. In this
context, extracting the HMM hidden state sequences can be viewed as a form of
feature engineering that is somewhat analogous to techniques that are commonly
employed in Natural Language Processing (NLP). We find that this NLP-based
approach outperforms other popular techniques on a challenging malware dataset,
with an HMM-Random Forest model yielding the best results.
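The hidden-state feature idea can be sketched as follows: decode the most likely hidden-state sequence for an opcode-index sequence with the Viterbi algorithm, then feed that sequence (or statistics of it) to a downstream classifier. The HMM parameters below are illustrative toy values, not the paper's trained models:

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden-state sequence for observation indices obs.
    pi: initial state probabilities, A: transitions, B: emissions."""
    n_states = len(pi)
    T = len(obs)
    # Work in log space to avoid underflow on long opcode sequences
    logd = np.log(pi) + np.log(B[:, obs[0]])
    back = np.zeros((T, n_states), dtype=int)
    for t in range(1, T):
        scores = logd[:, None] + np.log(A)   # scores[i, j]: from state i to j
        back[t] = scores.argmax(axis=0)
        logd = scores.max(axis=0) + np.log(B[:, obs[t]])
    # Backtrack from the best final state
    states = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):
        states.append(int(back[t, states[-1]]))
    return states[::-1]

# Toy 2-state HMM over a 3-symbol "opcode" alphabet (illustrative values)
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.2, 0.8]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.2, 0.7]])
opcodes = [0, 1, 2, 2, 2, 0]
path = viterbi(opcodes, pi, A, B)
# path is the decoded hidden-state sequence used as classifier features
```

In the hybrid architecture, such decoded sequences play the role that engineered token features often play in NLP pipelines.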
Malware Classification with GMM-HMM Models
Discrete hidden Markov models (HMM) are often applied to malware detection
and classification problems. However, the continuous analog of discrete HMMs,
that is, Gaussian mixture model-HMMs (GMM-HMM), are rarely considered in the
field of cybersecurity. In this paper, we use GMM-HMMs for malware
classification and we compare our results to those obtained using discrete
HMMs. As features, we consider opcode sequences and entropy-based sequences.
For our opcode features, GMM-HMMs produce results that are comparable to those
obtained using discrete HMMs, whereas for our entropy-based features, GMM-HMMs
generally improve significantly on the classification results that we have
achieved with discrete HMMs.
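Entropy-based sequences of the kind mentioned above are typically obtained by computing Shannon entropy over a sliding window of raw bytes; a minimal sketch, where the window and step sizes are illustrative choices rather than the paper's settings:

```python
import math
from collections import Counter

def entropy(window):
    # Shannon entropy (bits per byte) of one window of bytes
    counts = Counter(window)
    total = len(window)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def entropy_sequence(data, window=32, step=16):
    # Slide a window over the raw bytes to produce an observation sequence
    return [entropy(data[i:i + window])
            for i in range(0, len(data) - window + 1, step)]
```

The resulting real-valued sequence is a natural fit for a continuous GMM-HMM, whereas a discrete HMM would first have to quantize it.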
Hidden Markov Models with Random Restarts vs Boosting for Malware Detection
Effective and efficient malware detection is at the forefront of research
into building secure digital systems. As with many other fields, malware
detection research has seen a dramatic increase in the application of machine
learning algorithms. One machine learning technique that has been used widely
in the field of pattern matching in general-and malware detection in
particular-is hidden Markov models (HMMs). HMM training is based on a hill
climb, and hence we can often improve a model by training multiple times with
different initial values. In this research, we compare boosted HMMs (using
AdaBoost) to HMMs trained with multiple random restarts, in the context of
malware detection. These techniques are applied to a variety of challenging
malware datasets. We find that random restarts perform surprisingly well in
comparison to boosting. Only in the most difficult "cold start" cases (where
training data is severely limited) does boosting appear to offer sufficient
improvement to justify its higher computational cost in the scoring phase.
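The random-restart strategy generalizes to any hill-climb trainer: run the climb from several random initializations and keep the best-scoring model. The sketch below uses a toy multimodal objective as a stand-in for HMM log-likelihood (it is not Baum-Welch), purely to show why restarts help:

```python
import random

def f(x):
    # Toy objective with two local maxima: a poor one near x = -2 and the
    # global one near x = +2 (stands in for HMM log-likelihood)
    return -(x * x - 4) ** 2 + x

def hill_climb(rng, step=0.01, iters=1000):
    # Plain hill climb: which optimum we reach depends on the random start
    x = rng.uniform(-4.0, 4.0)
    for _ in range(iters):
        moved = False
        for d in (step, -step):
            if f(x + d) > f(x):
                x += d
                moved = True
                break
        if not moved:
            break
    return x, f(x)

def train_with_restarts(n_restarts, seed=0):
    # Keep the best-scoring model over several random initializations
    rng = random.Random(seed)
    return max((hill_climb(rng) for _ in range(n_restarts)),
               key=lambda pair: pair[1])
```

A single run may settle in the poor local maximum; with enough restarts, at least one start falls in the basin of the global maximum, which is exactly the effect exploited for HMM training.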